Using Microsoft SQL Server platform for plagiarism detection

Authors

  • Vladislav Shcherbinin
  • Sergey Butakov
Abstract

The paper presents an approach to plagiarism detection in a large corpus of documents using the Microsoft SQL Server platform. The approach was used to participate in the first international plagiarism detection competition, held as part of the PAN’09 workshop. Its main advantages are high precision, good performance, and readiness for deployment in a production environment, with a relatively low cost for the required third-party software. The approach uses a fingerprinting-based algorithm to compare documents and the Levenshtein metric to mark up plagiarized fragments in the texts.
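
The two techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SQL Server implementation, which the abstract does not describe in detail. The word-level shingles of size k, the use of CRC32 as the hash, and all function names are assumptions chosen only for illustration.

```python
import zlib


def fingerprints(text: str, k: int = 3) -> set[int]:
    """Hash every run of k consecutive words (a shingle) to an integer.

    Assumption: word-level shingles and CRC32 are illustrative choices,
    not details taken from the paper.
    """
    words = text.lower().split()
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {zlib.crc32(s.encode("utf-8")) for s in shingles}


def shared_fraction(suspicious: str, source: str, k: int = 3) -> float:
    """Fraction of the suspicious text's fingerprints also found in the source."""
    a, b = fingerprints(suspicious, k), fingerprints(source, k)
    return len(a & b) / len(a) if a else 0.0


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]
```

In a pipeline of this general shape, a cheap fingerprint comparison such as `shared_fraction` would first narrow a large corpus down to candidate document pairs, and an edit-distance measure such as `levenshtein` would then be applied to candidate passages to decide where a copied fragment starts and ends. For example, `levenshtein("kitten", "sitting")` returns 3.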


Similar resources

Health Ontology Generator: Design And Implementation

This paper presents the design and implementation of a Health Ontology Generator (HOG) using a health database such as Microsoft Access or SQL Server. The development of the ontology generator involves building methods for creating and reading the ontology. This research performs both these tasks. In generating the ontology, database tables are treated as classes, fields as functional propertie...


SQL Server Megaservers: Scalability, Availability, Manageability

Microsoft® SQL Server™ has evolved to support huge databases and applications, including multiterabyte databases used by millions of people. SQL Server achieves this scalability by supporting scale up on symmetric multiprocessor (SMP) systems, allowing users to add processors, memory, disks and networking to build a large single node, as well as scale out on multinode clusters, allowing a huge...


Geospatial Stream Query Processing using Microsoft SQL Server StreamInsight

Microsoft SQL Server spatial libraries contain several components that handle geometrical and geographical data types. With advances in geo-sensing technologies, there has been an increasing demand for geospatial streaming applications. Microsoft SQL Server StreamInsight (StreamInsight, for brevity) is a platform for developing and deploying streaming applications that run continuous queries ov...


Construction of Agricultural Products Logistics Information System Based on .Net and Wap

Functions and construction of agricultural products logistics system based on .NET and WAP technology are introduced in detail. The problems encountered during the process of system development and corresponding solutions are also illustrated. The Windows 2003 Server and SQL Server 2005 serve as the platform and background database server respectively, and windows are designed using the ASP.NET...


A workflow mining approach for deriving software process models

Technical Skills • Programming Languages: Java, Javascript, C++, C, PL/SQL, XML, XSLT, HTML, Groovy, Scala, Pascal, Delphi, Prolog, Lisp, Assembler, Visual Basic, Perl, Shell, etc... • Component Architectures: Java EE (J2EE JEE6), OSGi /Equinox/Felix, Spring, Corba, Quasar • Java Libraries and Frameworks: Eclipse RCP, Eclipse RAP, Swing, Awt, JSF/Facelets, JavaFX, EJB 2.* 3.*, JMS, JAXB, XStrea...




Publication date: 2009